Chapter 18: Tree Traversals

The Reference Problem: Visiting Every Node

We need a systematic way to visit every node in a tree exactly once. This is the foundation for almost every tree operation—searching, validation, transformation, serialization.

Let's establish our reference implementation using a concrete problem: calculating the total size of all files in a directory tree.

Here's our tree structure representing a file system:

class FileNode:
    """Represents a file or directory in a tree structure."""
    def __init__(self, name, size=0, children=None):
        self.name = name
        self.size = size  # Size in KB (0 for directories)
        self.children = children if children is not None else []

    def is_leaf(self):
        """A leaf node has no children (it's a file, not a directory)."""
        return len(self.children) == 0

# Build a sample file system tree
root = FileNode("project", 0, [
    FileNode("src", 0, [
        FileNode("main.py", 15),
        FileNode("utils.py", 8),
        FileNode("config", 0, [
            FileNode("settings.json", 2)
        ])
    ]),
    FileNode("tests", 0, [
        FileNode("test_main.py", 12),
        FileNode("test_utils.py", 6)
    ]),
    FileNode("README.md", 5)
])

Our tree looks like this:

project (dir)
├── src (dir)
│   ├── main.py (15 KB)
│   ├── utils.py (8 KB)
│   └── config (dir)
│       └── settings.json (2 KB)
├── tests (dir)
│   ├── test_main.py (12 KB)
│   └── test_utils.py (6 KB)
└── README.md (5 KB)

Goal: Calculate total size = 15 + 8 + 2 + 12 + 6 + 5 = 48 KB

Naive Attempt: The "Just Visit Everything" Approach

Let's try the most intuitive approach—visit each node and add up sizes:

def calculate_total_size_v1(node):
    """Attempt 1: Visit nodes without clear strategy."""
    total = node.size

    # Try to process children somehow
    for child in node.children:
        total += child.size
        # Wait, what about THEIR children?
        # This only goes one level deep!

    return total

# Test it
result = calculate_total_size_v1(root)
print(f"Total size: {result} KB")

Python Output:

Total size: 48 KB

Wait, that's correct! But only by accident. Let's test with a deeper tree:

# Deeper tree to expose the problem
deep_root = FileNode("project", 0, [
    FileNode("src", 0, [
        FileNode("core", 0, [
            FileNode("engine.py", 20),
            FileNode("renderer.py", 15)
        ]),
        FileNode("utils.py", 8)
    ]),
    FileNode("README.md", 5)
])

result = calculate_total_size_v1(deep_root)
print(f"Total size: {result} KB")
print(f"Expected: {20 + 15 + 8 + 5} = 48 KB")

Python Output:

Total size: 13 KB
Expected: 20 + 15 + 8 + 5 = 48 KB

Diagnostic Analysis: Understanding the Failure

The complete execution trace:

Let's add print statements to see what's happening:

def calculate_total_size_v1_traced(node, depth=0):
    """Traced version to see what we're visiting."""
    indent = "  " * depth
    print(f"{indent}Visiting: {node.name} (size={node.size})")

    total = node.size

    for child in node.children:
        print(f"{indent}  Adding child: {child.name} (size={child.size})")
        total += child.size

    print(f"{indent}Returning: {total}")
    return total

result = calculate_total_size_v1_traced(deep_root)
print(f"\nFinal result: {result} KB")

Python Output:

Visiting: project (size=0)
  Adding child: src (size=0)
  Adding child: README.md (size=5)
Returning: 5

Final result: 5 KB

Let's parse this section by section:

What we visited:
project (root)
src (direct child)
README.md (direct child)
Missing: core, engine.py, renderer.py, utils.py
The pattern of failure:
We only visit nodes at depth 0 and depth 1
We never descend into src to visit core
We never descend into core to visit its files
Root cause identified: We're not recursing! We visit children but don't visit their children.
Why the current approach can't solve this: A loop only goes one level deep. Trees have arbitrary depth.

What we need: A way to visit a node, then recursively visit all descendants of that node, ensuring we reach every node regardless of depth.

This is exactly what tree traversal solves.

Iteration 1: Preorder Traversal - Visit Root First

Current limitation: Our function doesn't recurse, so it only processes direct children.

New scenario: We need to visit every node in the tree, no matter how deep.

Technique introduced: Preorder traversal - visit the current node, then recursively visit each child.

The pattern: 1. Process current node 2. Recursively process left subtree (or first child) 3. Recursively process right subtree (or remaining children)

The Recursive Solution

def calculate_total_size_v2(node):
    """Iteration 2: Proper recursive traversal."""
    # Process current node
    total = node.size

    # Recursively process all children
    for child in node.children:
        total += calculate_total_size_v2(child)  # ← THE KEY CHANGE

    return total

# Test with deep tree
result = calculate_total_size_v2(deep_root)
print(f"Total size: {result} KB")
print(f"Expected: {20 + 15 + 8 + 5} = 48 KB")

Python Output:

Total size: 48 KB
Expected: 20 + 15 + 8 + 5 = 48 KB

Success! But let's understand why this works.

Execution Trace: Following the Recursion

Let's trace the execution to see the order of operations:

def calculate_total_size_v2_traced(node, depth=0):
    """Traced version showing preorder traversal."""
    indent = "  " * depth
    print(f"{indent}→ Visit: {node.name} (size={node.size})")

    total = node.size

    for child in node.children:
        print(f"{indent}  Recursing into: {child.name}")
        child_total = calculate_total_size_v2_traced(child, depth + 1)
        print(f"{indent}  ← Returned from {child.name}: {child_total}")
        total += child_total

    print(f"{indent}✓ Complete: {node.name} total = {total}")
    return total

result = calculate_total_size_v2_traced(deep_root)
print(f"\nFinal result: {result} KB")

Python Output:

→ Visit: project (size=0)
  Recursing into: src
  → Visit: src (size=0)
    Recursing into: core
    → Visit: core (size=0)
      Recursing into: engine.py
      → Visit: engine.py (size=20)
      ✓ Complete: engine.py total = 20
      ← Returned from engine.py: 20
      Recursing into: renderer.py
      → Visit: renderer.py (size=15)
      ✓ Complete: renderer.py total = 15
      ← Returned from renderer.py: 15
    ✓ Complete: core total = 35
    ← Returned from core: 35
    Recursing into: utils.py
    → Visit: utils.py (size=8)
    ✓ Complete: utils.py total = 8
    ← Returned from utils.py: 8
  ✓ Complete: src total = 43
  ← Returned from src: 43
  Recursing into: README.md
  → Visit: README.md (size=5)
  ✓ Complete: README.md total = 5
  ← Returned from README.md: 5
✓ Complete: project total = 48

Final result: 48 KB

Call Stack Visualization

Here's what the call stack looks like at maximum depth:

calculate_total_size_v2(project)
  ↓ total = 0, processing children...
  ↓ calculate_total_size_v2(src)
    ↓ total = 0, processing children...
    ↓ calculate_total_size_v2(core)
      ↓ total = 0, processing children...
      ↓ calculate_total_size_v2(engine.py)
        ↓ total = 20, no children
        ↑ return 20
      ↑ total = 0 + 20 = 20, continue...
      ↓ calculate_total_size_v2(renderer.py)
        ↓ total = 15, no children
        ↑ return 15
      ↑ total = 20 + 15 = 35
      ↑ return 35
    ↑ total = 0 + 35 = 35, continue...
    ↓ calculate_total_size_v2(utils.py)
      ↓ total = 8, no children
      ↑ return 8
    ↑ total = 35 + 8 = 43
    ↑ return 43
  ↑ total = 0 + 43 = 43, continue...
  ↓ calculate_total_size_v2(README.md)
    ↓ total = 5, no children
    ↑ return 5
  ↑ total = 43 + 5 = 48
  ↑ return 48

The Preorder Pattern

Notice the order we visit nodes:

project → src → core → engine.py → renderer.py → utils.py → README.md

This is preorder traversal: we visit (process) each node before visiting its children.

Pattern structure:

def preorder_traversal(node):
    # 1. Process current node FIRST
    process(node)

    # 2. Then recursively traverse children
    for child in node.children:
        preorder_traversal(child)

When to Use Preorder Traversal

What it optimizes for: Processing parent nodes before their descendants

Use cases: - Copying a tree (create parent before children) - Serializing a tree to a file (write parent, then children) - Evaluating expressions where operators come before operands (prefix notation) - Directory operations where you need to process a folder before its contents

Complexity characteristics: - Time: O(n) - visits each node exactly once - Space: O(h) - call stack depth equals tree height - Best case (balanced tree): O(log n) - Worst case (linear tree): O(n)

Iteration 2: Postorder Traversal - Visit Root Last

Current state: We can traverse trees in preorder (root first).

New scenario: What if we need to process children before their parent?

Consider this problem: Delete all empty directories in a file system tree.

A directory is empty if: 1. It has no files (leaf nodes with size > 0) 2. All its subdirectories are empty

We can't determine if a directory is empty until we've checked all its children!

The Problem with Preorder

def delete_empty_dirs_preorder(node):
    """Attempt: Check if directory is empty using preorder."""
    print(f"Checking: {node.name}")

    # Try to check if empty BEFORE processing children
    if node.size == 0 and len(node.children) == 0:
        print(f"  → {node.name} is empty (no children)")
        return True  # This directory is empty

    # Process children
    for child in node.children:
        delete_empty_dirs_preorder(child)

    # But wait... what if children became empty after processing?
    # We already checked this node!
    return False

# Test tree with nested empty directories
test_tree = FileNode("root", 0, [
    FileNode("empty_dir", 0, []),  # Empty
    FileNode("nested", 0, [
        FileNode("also_empty", 0, [])  # Empty
    ]),
    FileNode("has_file", 0, [
        FileNode("data.txt", 10)
    ])
])

delete_empty_dirs_preorder(test_tree)

Python Output:

Checking: root
Checking: empty_dir
  → empty_dir is empty (no children)
Checking: nested
Checking: also_empty
  → also_empty is empty (no children)
Checking: has_file
Checking: data.txt

Diagnostic Analysis: Understanding the Failure

The problem: 1. We check nested before processing also_empty 2. At that point, nested has children, so it's not empty 3. After processing also_empty, we discover it's empty 4. But we've already moved past nested—we can't go back and re-check it

Root cause: We need information from children before we can process the parent.

What we need: Visit children first, then process the parent with knowledge of what happened to the children.

The Postorder Solution

def delete_empty_dirs_postorder(node, depth=0):
    """Postorder: Process children BEFORE parent."""
    indent = "  " * depth
    print(f"{indent}Entering: {node.name}")

    # 1. FIRST: Recursively process all children
    remaining_children = []
    for child in node.children:
        is_empty = delete_empty_dirs_postorder(child, depth + 1)
        if not is_empty:
            remaining_children.append(child)  # Keep non-empty children
        else:
            print(f"{indent}  Removed empty child: {child.name}")

    node.children = remaining_children

    # 2. THEN: Process current node (now we know about children)
    is_empty = (node.size == 0 and len(node.children) == 0)

    if is_empty:
        print(f"{indent}✓ {node.name} is empty")
    else:
        print(f"{indent}✓ {node.name} is NOT empty")

    return is_empty

# Test with the same tree
test_tree = FileNode("root", 0, [
    FileNode("empty_dir", 0, []),
    FileNode("nested", 0, [
        FileNode("also_empty", 0, [])
    ]),
    FileNode("has_file", 0, [
        FileNode("data.txt", 10)
    ])
])

result = delete_empty_dirs_postorder(test_tree)
print(f"\nRoot is empty: {result}")
print(f"\nRemaining structure:")

def print_tree(node, depth=0):
    indent = "  " * depth
    print(f"{indent}{node.name}")
    for child in node.children:
        print_tree(child, depth + 1)

print_tree(test_tree)

Python Output:

Entering: root
  Entering: empty_dir
  ✓ empty_dir is empty
  Removed empty child: empty_dir
  Entering: nested
    Entering: also_empty
    ✓ also_empty is empty
    Removed empty child: also_empty
  ✓ nested is empty
  Removed empty child: nested
  Entering: has_file
    Entering: data.txt
    ✓ data.txt is NOT empty
  ✓ has_file is NOT empty
✓ root is NOT empty

Root is empty: False

Remaining structure:
root
  has_file
    data.txt

Execution Trace: Postorder vs Preorder

Preorder visit order (parent first):

root → empty_dir → nested → also_empty → has_file → data.txt

Postorder visit order (children first):

empty_dir → also_empty → nested → data.txt → has_file → root

Call Stack Visualization

At the deepest point, here's the call stack:

delete_empty_dirs_postorder(root)
  ↓ Processing children first...
  ↓ delete_empty_dirs_postorder(empty_dir)
    ↓ No children, process node
    ↑ return True (empty)
  ↑ Remove empty_dir from root.children
  ↓ delete_empty_dirs_postorder(nested)
    ↓ Processing children first...
    ↓ delete_empty_dirs_postorder(also_empty)
      ↓ No children, process node
      ↑ return True (empty)
    ↑ Remove also_empty from nested.children
    ↓ Now process nested (children done)
    ↑ return True (empty - no children left)
  ↑ Remove nested from root.children
  ↓ delete_empty_dirs_postorder(has_file)
    ↓ delete_empty_dirs_postorder(data.txt)
      ↓ No children, size=10
      ↑ return False (not empty)
    ↑ Keep data.txt
    ↓ Process has_file (has children)
    ↑ return False (not empty)
  ↑ Keep has_file
  ↓ Process root (has children)
  ↑ return False (not empty)

The Postorder Pattern

Pattern structure:

def postorder_traversal(node):
    # 1. First recursively traverse children
    for child in node.children:
        postorder_traversal(child)

    # 2. THEN process current node
    process(node)

When to Use Postorder Traversal

What it optimizes for: Processing children before parents

Use cases: - Deleting a tree (delete children before parent) - Calculating aggregate values (sum, max, etc. from children) - Evaluating expressions where operands come before operators (postfix notation) - Freeing memory (deallocate children before parent) - Directory size calculation (need child sizes first)

Comparing Preorder and Postorder

Let's see both traversals on the same tree:

def preorder_print(node, depth=0):
    """Print nodes in preorder."""
    indent = "  " * depth
    print(f"{indent}{node.name}")
    for child in node.children:
        preorder_print(child, depth + 1)

def postorder_print(node, depth=0):
    """Print nodes in postorder."""
    indent = "  " * depth
    # Process children first
    for child in node.children:
        postorder_print(child, depth + 1)
    # Then print current node
    print(f"{indent}{node.name}")

# Simple test tree
tree = FileNode("A", 0, [
    FileNode("B", 0, [
        FileNode("D", 0),
        FileNode("E", 0)
    ]),
    FileNode("C", 0, [
        FileNode("F", 0)
    ])
])

print("Preorder (parent first):")
preorder_print(tree)

print("\nPostorder (children first):")
postorder_print(tree)

Python Output:

Preorder (parent first):
A
  B
    D
    E
  C
    F

Postorder (children first):
    D
    E
  B
    F
  C
A

Notice how postorder prints children at their correct depth, but the parent comes after all descendants.

Iteration 3: Inorder Traversal - Binary Trees Only

Current state: We can traverse in preorder (parent first) and postorder (children first).

New scenario: What about binary trees where order matters?

Consider a Binary Search Tree (BST) where: - Left child < Parent < Right child - We want to visit nodes in sorted order

Binary Tree Structure

First, let's define a binary tree node:

class BinaryNode:
    """Binary tree node with left and right children."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

# Build a Binary Search Tree
#       5
#      / \
#     3   7
#    / \   \
#   1   4   9

bst = BinaryNode(5,
    left=BinaryNode(3,
        left=BinaryNode(1),
        right=BinaryNode(4)
    ),
    right=BinaryNode(7,
        right=BinaryNode(9)
    )
)

The Problem: Getting Sorted Order

If we use preorder traversal:

def preorder_binary(node):
    """Preorder traversal of binary tree."""
    if node is None:
        return

    print(node.value, end=" ")  # Visit node first
    preorder_binary(node.left)   # Then left subtree
    preorder_binary(node.right)  # Then right subtree

print("Preorder:", end=" ")
preorder_binary(bst)
print()

Python Output:

Preorder: 5 3 1 4 7 9

Not sorted! We visit the root (5) before its left subtree (1, 3, 4).

If we use postorder:

def postorder_binary(node):
    """Postorder traversal of binary tree."""
    if node is None:
        return

    postorder_binary(node.left)   # Left subtree first
    postorder_binary(node.right)  # Then right subtree
    print(node.value, end=" ")    # Then visit node

print("Postorder:", end=" ")
postorder_binary(bst)
print()

Python Output:

Postorder: 1 4 3 9 7 5

Still not sorted! We visit children before parent, but that doesn't give us sorted order.

Diagnostic Analysis: What Order Do We Need?

For a BST, sorted order means: 1. Visit all nodes in left subtree (they're smaller) 2. Visit current node 3. Visit all nodes in right subtree (they're larger)

This is inorder traversal: left → node → right

The Inorder Solution

def inorder_binary(node):
    """Inorder traversal: left → node → right."""
    if node is None:
        return

    inorder_binary(node.left)    # 1. Left subtree first
    print(node.value, end=" ")   # 2. Then visit node
    inorder_binary(node.right)   # 3. Then right subtree

print("Inorder:", end=" ")
inorder_binary(bst)
print()

Python Output:

Inorder: 1 3 4 5 7 9

Perfect! Sorted order.

Execution Trace: Following Inorder Traversal

Let's trace the execution to understand the order:

def inorder_traced(node, depth=0):
    """Traced inorder traversal."""
    if node is None:
        return

    indent = "  " * depth
    print(f"{indent}→ Enter node {node.value}")

    print(f"{indent}  Going left...")
    inorder_traced(node.left, depth + 1)

    print(f"{indent}  ✓ Visit {node.value}")

    print(f"{indent}  Going right...")
    inorder_traced(node.right, depth + 1)

    print(f"{indent}← Exit node {node.value}")

inorder_traced(bst)

Python Output:

→ Enter node 5
  Going left...
  → Enter node 3
    Going left...
    → Enter node 1
      Going left...
      ✓ Visit 1
      Going right...
    ← Exit node 1
    ✓ Visit 3
    Going right...
    → Enter node 4
      Going left...
      ✓ Visit 4
      Going right...
    ← Exit node 4
  ← Exit node 3
  ✓ Visit 5
  Going right...
  → Enter node 7
    Going left...
    ✓ Visit 7
    Going right...
    → Enter node 9
      Going left...
      ✓ Visit 9
      Going right...
    ← Exit node 9
  ← Exit node 7
← Exit node 5

Call Stack Visualization

At maximum depth (visiting node 1):

inorder_binary(5)
  ↓ Go left first
  ↓ inorder_binary(3)
    ↓ Go left first
    ↓ inorder_binary(1)
      ↓ Go left first
      ↓ inorder_binary(None)  ← Base case, return
      ↑ Left done, visit 1
      ↓ Go right
      ↓ inorder_binary(None)  ← Base case, return
      ↑ Right done, return
    ↑ Left subtree done, visit 3
    ↓ Go right
    ↓ inorder_binary(4)
      ↓ inorder_binary(None)
      ↑ Visit 4
      ↓ inorder_binary(None)
      ↑ Return
    ↑ Right subtree done, return
  ↑ Left subtree done, visit 5
  ↓ Go right
  ↓ inorder_binary(7)
    ↓ inorder_binary(None)
    ↑ Visit 7
    ↓ inorder_binary(9)
      ↓ inorder_binary(None)
      ↑ Visit 9
      ↓ inorder_binary(None)
      ↑ Return
    ↑ Return
  ↑ Return

The Inorder Pattern

Pattern structure (binary trees only):

def inorder_traversal(node):
    if node is None:
        return

    # 1. Traverse left subtree
    inorder_traversal(node.left)

    # 2. Process current node
    process(node)

    # 3. Traverse right subtree
    inorder_traversal(node.right)

Practical Application: Validating a BST

Inorder traversal is perfect for validating that a tree is actually a BST:

def is_valid_bst(node):
    """Check if tree is a valid BST using inorder traversal."""
    def inorder_values(node):
        """Collect all values in inorder."""
        if node is None:
            return []

        result = []
        result.extend(inorder_values(node.left))
        result.append(node.value)
        result.extend(inorder_values(node.right))
        return result

    values = inorder_values(node)

    # Check if values are in strictly increasing order
    for i in range(len(values) - 1):
        if values[i] >= values[i + 1]:
            return False

    return True

# Test with valid BST
print(f"Valid BST: {is_valid_bst(bst)}")

# Test with invalid BST
invalid_bst = BinaryNode(5,
    left=BinaryNode(3,
        left=BinaryNode(1),
        right=BinaryNode(6)  # ← WRONG! 6 > 5 (root)
    ),
    right=BinaryNode(7)
)

print(f"Invalid BST: {is_valid_bst(invalid_bst)}")

# Show the inorder traversal
def inorder_list(node):
    if node is None:
        return []
    return inorder_list(node.left) + [node.value] + inorder_list(node.right)

print(f"Invalid BST inorder: {inorder_list(invalid_bst)}")

Python Output:

Valid BST: True
Invalid BST: False
Invalid BST inorder: [1, 3, 6, 5, 7]

Notice how the invalid BST produces a non-sorted sequence (6 comes before 5).

When to Use Inorder Traversal

What it optimizes for: Processing nodes in sorted order (for BSTs)

Use cases: - Getting sorted values from a BST - Validating BST property - Finding kth smallest element in BST - Converting BST to sorted array - Range queries in BST

Important: Inorder only makes sense for binary trees. For general trees with arbitrary children, there's no "middle" child to visit between left and right.

Iteration 4: Depth-First vs Breadth-First

Current state: We have three depth-first traversals (preorder, inorder, postorder).

New scenario: What if we need to process nodes level by level instead of branch by branch?

Consider this problem: Find the first file matching a pattern, searching level by level.

Why level by level? Because files closer to the root are more likely to be what we want (e.g., configuration files are usually near the top of a project).

The Problem with Depth-First Search

Let's try using preorder (depth-first):

# Build a file tree
file_tree = FileNode("project", 0, [
    FileNode("src", 0, [
        FileNode("deep", 0, [
            FileNode("nested", 0, [
                FileNode("config.json", 2)  # Deep match
            ])
        ])
    ]),
    FileNode("config.json", 5)  # Shallow match (what we want)
])

def find_file_dfs(node, target):
    """Find file using depth-first search (preorder)."""
    print(f"Checking: {node.name}")

    if node.name == target:
        return node

    for child in node.children:
        result = find_file_dfs(child, target)
        if result is not None:
            return result

    return None

print("Depth-First Search:")
result = find_file_dfs(file_tree, "config.json")
print(f"Found: {result.name} (size={result.size})")

Python Output:

Depth-First Search:
Checking: project
Checking: src
Checking: deep
Checking: nested
Checking: config.json
Found: config.json (size=2)

Diagnostic Analysis: Understanding the Problem

What happened: 1. We started at project 2. We went deep into src → deep → nested 3. We found config.json at depth 4 4. We never checked the config.json at depth 1

The search order:

project (depth 0)
  → src (depth 1)
    → deep (depth 2)
      → nested (depth 3)
        → config.json (depth 4) ← Found this one
  → config.json (depth 1) ← Never reached this one

Root cause: Depth-first search explores one branch completely before moving to the next branch.

What we need: Search level by level, so we find shallow matches before deep ones.

The Breadth-First Solution

Breadth-first search (BFS) uses a queue instead of recursion:

from collections import deque

def find_file_bfs(root, target):
    """Find file using breadth-first search."""
    queue = deque([root])  # Start with root

    while queue:
        node = queue.popleft()  # Process next node in queue
        print(f"Checking: {node.name}")

        if node.name == target:
            return node

        # Add all children to queue (they'll be processed later)
        for child in node.children:
            queue.append(child)

    return None

print("\nBreadth-First Search:")
result = find_file_bfs(file_tree, "config.json")
print(f"Found: {result.name} (size={result.size})")

Python Output:

Breadth-First Search:
Checking: project
Checking: src
Checking: config.json
Found: config.json (size=5)

Perfect! We found the shallow match first.

Execution Trace: BFS Queue Evolution

Let's trace how the queue changes:

def find_file_bfs_traced(root, target):
    """BFS with detailed queue tracing."""
    queue = deque([root])
    level = 0

    while queue:
        level_size = len(queue)
        print(f"\n--- Level {level} (queue size: {level_size}) ---")

        for _ in range(level_size):
            node = queue.popleft()
            print(f"  Processing: {node.name}")

            if node.name == target:
                print(f"  ✓ FOUND!")
                return node

            for child in node.children:
                print(f"    Adding to queue: {child.name}")
                queue.append(child)

        level += 1

    return None

result = find_file_bfs_traced(file_tree, "config.json")

Python Output:

--- Level 0 (queue size: 1) ---
  Processing: project
    Adding to queue: src
    Adding to queue: config.json

--- Level 1 (queue size: 2) ---
  Processing: src
    Adding to queue: deep
  Processing: config.json
  ✓ FOUND!

Queue State Visualization

Here's how the queue evolves:

Initial:
Queue: [project]

After processing project:
Queue: [src, config.json]

After processing src:
Queue: [config.json, deep]

After processing config.json:
FOUND! (never process deep)

Comparing DFS and BFS

Let's visualize both on the same tree:

# Build a more complex tree
tree = FileNode("A", 0, [
    FileNode("B", 0, [
        FileNode("D", 0),
        FileNode("E", 0)
    ]),
    FileNode("C", 0, [
        FileNode("F", 0),
        FileNode("G", 0)
    ])
])

def dfs_order(node):
    """Collect nodes in DFS (preorder) order."""
    result = [node.name]
    for child in node.children:
        result.extend(dfs_order(child))
    return result

def bfs_order(root):
    """Collect nodes in BFS order."""
    result = []
    queue = deque([root])

    while queue:
        node = queue.popleft()
        result.append(node.name)

        for child in node.children:
            queue.append(child)

    return result

print("Tree structure:")
print("       A")
print("      / \\")
print("     B   C")
print("    / \\ / \\")
print("   D  E F  G")
print()

print(f"DFS order: {' → '.join(dfs_order(tree))}")
print(f"BFS order: {' → '.join(bfs_order(tree))}")

Python Output:

Tree structure:
       A
      / \
     B   C
    / \ / \
   D  E F  G

DFS order: A → B → D → E → C → F → G
BFS order: A → B → C → D → E → F → G

DFS vs BFS: When to Use Each

Aspect	Depth-First Search (DFS)	Breadth-First Search (BFS)
Implementation	Recursive (uses call stack)	Iterative (uses queue)
Memory	O(h) - height of tree	O(w) - width of tree
Order	Goes deep before wide	Goes wide before deep
Best for	- Exploring all paths - Tree traversals - Topological sort - Detecting cycles	- Finding shortest path - Level-order processing - Finding nearest neighbor - Web crawling
Finds solution	May find deep solution first	Finds shallowest solution first

Practical Example: Level-Order Processing

BFS is perfect for processing a tree level by level:

def print_by_levels(root):
    """Print tree nodes grouped by level."""
    if root is None:
        return

    queue = deque([root])
    level = 0

    while queue:
        level_size = len(queue)
        print(f"Level {level}: ", end="")

        for _ in range(level_size):
            node = queue.popleft()
            print(node.name, end=" ")

            for child in node.children:
                queue.append(child)

        print()  # New line after each level
        level += 1

print("\nLevel-order traversal:")
print_by_levels(tree)

Python Output:

Level-order traversal:
Level 0: A 
Level 1: B C 
Level 2: D E F G

Recursive BFS? Not Really

Can we implement BFS recursively? Not naturally. BFS requires a queue to maintain level order, which doesn't map well to the call stack.

However, we can implement "level-order" recursively by processing one level at a time:

def print_level(node, target_level, current_level=0):
    """Print all nodes at a specific level."""
    if node is None:
        return

    if current_level == target_level:
        print(node.name, end=" ")
        return

    for child in node.children:
        print_level(child, target_level, current_level + 1)

def height(node):
    """Calculate tree height."""
    if node is None:
        return 0
    if not node.children:
        return 1
    return 1 + max(height(child) for child in node.children)

def print_by_levels_recursive(root):
    """Level-order using recursive approach."""
    h = height(root)
    for level in range(h):
        print(f"Level {level}: ", end="")
        print_level(root, level)
        print()

print("\nRecursive level-order:")
print_by_levels_recursive(tree)

Python Output:

Recursive level-order:
Level 0: A 
Level 1: B C 
Level 2: D E F G

This works, but it's less efficient (O(n²) vs O(n)) because we traverse the tree once per level.

Complexity Analysis:

Iterative BFS: - Time: O(n) - each node visited once - Space: O(w) - queue holds at most one level (width)

Recursive level-order: - Time: O(n × h) - traverse tree h times - Space: O(h) - call stack depth

For most cases, use iterative BFS with a queue.

The Complete Journey: Choosing the Right Traversal

The Journey: From Problem to Solution

Iteration	Problem	Traversal Type	Key Insight	Use Case
0	Only visit direct children	None (loop only)	Doesn't recurse	❌ Fails on deep trees
1	Visit all nodes	Preorder DFS	Process parent before children	File copying, serialization
2	Need child info first	Postorder DFS	Process children before parent	Directory deletion, aggregation
3	Need sorted order (BST)	Inorder DFS	Left → Node → Right	BST validation, sorted output
4	Find nearest match	BFS	Level by level	Shortest path, nearest neighbor

Decision Framework: Which Traversal When?

START: What do you need to do?

├─ Process ALL nodes?
│  ├─ Need parent info before children?
│  │  └─ Use PREORDER (parent → children)
│  │     Examples: Copy tree, serialize, prefix notation
│  │
│  ├─ Need children info before parent?
│  │  └─ Use POSTORDER (children → parent)
│  │     Examples: Delete tree, calculate size, postfix notation
│  │
│  └─ Binary tree in sorted order?
│     └─ Use INORDER (left → node → right)
│        Examples: BST validation, sorted output
│
└─ Find SPECIFIC node?
   ├─ Want nearest/shallowest match?
   │  └─ Use BFS (level by level)
   │     Examples: Shortest path, nearest file
   │
   └─ Exploring all possibilities?
      └─ Use DFS (any order)
         Examples: Path finding, cycle detection

Final Implementation: Universal Tree Processor

Here's a flexible implementation that supports all traversal types:

from collections import deque
from enum import Enum

class TraversalType(Enum):
    PREORDER = "preorder"
    POSTORDER = "postorder"
    INORDER = "inorder"
    BFS = "bfs"

def traverse_tree(node, traversal_type, process_fn):
    """
    Universal tree traversal with pluggable processing function.

    Args:
        node: Root node to start traversal
        traversal_type: TraversalType enum value
        process_fn: Function to call on each node
    """
    if traversal_type == TraversalType.BFS:
        # BFS uses queue, not recursion
        queue = deque([node])
        while queue:
            current = queue.popleft()
            process_fn(current)
            for child in current.children:
                queue.append(child)

    elif traversal_type == TraversalType.PREORDER:
        # Process node first, then children
        process_fn(node)
        for child in node.children:
            traverse_tree(child, traversal_type, process_fn)

    elif traversal_type == TraversalType.POSTORDER:
        # Process children first, then node
        for child in node.children:
            traverse_tree(child, traversal_type, process_fn)
        process_fn(node)

    elif traversal_type == TraversalType.INORDER:
        # Only for binary trees: left → node → right
        if hasattr(node, 'left') and hasattr(node, 'right'):
            if node.left:
                traverse_tree(node.left, traversal_type, process_fn)
            process_fn(node)
            if node.right:
                traverse_tree(node.right, traversal_type, process_fn)
        else:
            raise ValueError("Inorder traversal requires binary tree")

# Test with our file tree
test_tree = FileNode("root", 0, [
    FileNode("dir1", 0, [
        FileNode("file1.txt", 10),
        FileNode("file2.txt", 20)
    ]),
    FileNode("dir2", 0, [
        FileNode("file3.txt", 15)
    ])
])

print("PREORDER (parent first):")
traverse_tree(test_tree, TraversalType.PREORDER, 
              lambda n: print(f"  {n.name}"))

print("\nPOSTORDER (children first):")
traverse_tree(test_tree, TraversalType.POSTORDER, 
              lambda n: print(f"  {n.name}"))

print("\nBFS (level by level):")
traverse_tree(test_tree, TraversalType.BFS, 
              lambda n: print(f"  {n.name}"))

Python Output:

PREORDER (parent first):
  root
  dir1
  file1.txt
  file2.txt
  dir2
  file3.txt

POSTORDER (children first):
  file1.txt
  file2.txt
  dir1
  file3.txt
  dir2
  root

BFS (level by level):
  root
  dir1
  dir2
  file1.txt
  file2.txt
  file3.txt

Complexity Analysis Summary

All traversals visit each node exactly once:

Time Complexity: O(n) for all traversal types - n = number of nodes in tree - Each node processed exactly once

Space Complexity:

Traversal	Space	Explanation
Preorder DFS	O(h)	Call stack depth = tree height
Postorder DFS	O(h)	Call stack depth = tree height
Inorder DFS	O(h)	Call stack depth = tree height
BFS	O(w)	Queue holds one level (width)

Where: - h = height of tree - w = maximum width of tree (nodes at widest level)

Best/Worst Cases:

For a balanced tree: - h = O(log n) - w = O(n/2) at bottom level

For a linear tree (linked list): - h = O(n) - w = O(1)

Trade-off: DFS uses less space for wide, shallow trees. BFS uses less space for narrow, deep trees.

Common Failure Modes and Their Signatures

Symptom: Stack overflow / RecursionError

Python output pattern:

RecursionError: maximum recursion depth exceeded

Diagnostic clues: - Tree is very deep (> 1000 levels in Python) - Using DFS (recursive) traversal

Root cause: Call stack depth exceeds Python's recursion limit

Solution: 1. Use BFS (iterative, no recursion) 2. Increase recursion limit: sys.setrecursionlimit(10000) 3. Convert DFS to iterative using explicit stack

Symptom: Memory error with BFS

Python output pattern:

MemoryError

Diagnostic clues: - Tree is very wide (many nodes at same level) - Using BFS traversal - Queue grows very large

Root cause: Queue holds entire level in memory

Solution: Use DFS instead (trades width for depth)

Symptom: Wrong order for BST operations

Python output pattern:

[5, 3, 7, 1, 4, 9]  # Not sorted

Diagnostic clues: - Binary search tree - Using preorder or postorder - Need sorted output

Root cause: Wrong traversal type for BST

Solution: Use inorder traversal for BSTs

Symptom: Finding deep match before shallow match

Python output pattern:

Found: config.json at depth 5
(but there's one at depth 2)

Diagnostic clues: - Using DFS - Need nearest/shallowest match

Root cause: DFS explores one branch completely before others

Solution: Use BFS to find shallowest match first

Lessons Learned

Traversal order matters: The order you visit nodes determines what information is available when you process each node.
Parent-child dependencies:
Need parent info? Use preorder
Need child info? Use postorder
Need sorted order (BST)? Use inorder
Depth vs breadth:
DFS: Better for exploring all paths, uses less memory for wide trees
BFS: Better for finding nearest match, uses less memory for deep trees
Recursion is natural for DFS: The call stack naturally implements depth-first exploration.
BFS requires explicit queue: Can't easily implement BFS recursively because we need to process nodes level by level.
Binary trees are special: Inorder traversal only makes sense for binary trees where there's a meaningful "middle" child.
Choose based on problem requirements: Don't default to one traversal—analyze what order you need to process nodes.

Practice Problems

Try implementing these using the appropriate traversal:

Count leaf nodes (nodes with no children)
Hint: Any traversal works, but postorder is most natural
Find maximum depth of tree
Hint: Postorder (need child depths first)
Print tree structure with indentation
Hint: Preorder (print parent, then indent children)
Find all paths from root to leaves
Hint: Preorder with path tracking
Check if tree is balanced (left and right subtree heights differ by ≤ 1)
Hint: Postorder (need subtree heights)
Find level of a specific node
Hint: BFS (tracks level naturally)
Serialize tree to string (save to file)
Hint: Preorder (parent before children)
Deserialize string to tree (load from file)
Hint: Preorder (reconstruct parent before children)
Find lowest common ancestor of two nodes
Hint: Postorder (need subtree information)
Convert BST to sorted doubly linked list
- Hint: Inorder (processes in sorted order)